Disinformation on the Web: Impact, Characteristics, and Detection of Wikipedia Hoaxes

Authors

  • Srijan Kumar
  • Robert West
  • Jure Leskovec
Abstract

Wikipedia is a major source of information for many people. However, false information on Wikipedia raises concerns about its credibility. One way in which false information may be presented on Wikipedia is in the form of hoax articles, i.e., articles containing fabricated facts about nonexistent entities or events. In this paper we study false information on Wikipedia by focusing on the hoax articles that have been created throughout its history. We make several contributions. First, we assess the real-world impact of hoax articles by measuring how long they survive before being debunked, how many pageviews they receive, and how heavily they are referred to by documents on the Web. We find that, while most hoaxes are detected quickly and have little impact on Wikipedia, a small number of hoaxes survive long and are well cited across the Web. Second, we characterize the nature of successful hoaxes by comparing them to legitimate articles and to failed hoaxes that were discovered shortly after being created. We find characteristic differences in terms of article structure and content, embeddedness into the rest of Wikipedia, and features of the editor who created the hoax. Third, we successfully apply our findings to address a series of classification tasks, most notably to determine whether a given article is a hoax. And finally, we describe and evaluate a task involving humans distinguishing hoaxes from non-hoaxes. We find that humans are not good at solving this task and that our automated classifier outperforms them by a big margin.

Related resources

Antisocial Behavior on the Web: Characterization and Detection

Web platforms enable unprecedented breadth and speed in the transmission of knowledge, and allow users to communicate and shape opinions. However, the safety, usability and reliability of these platforms are compromised by the prevalence of online antisocial behavior; for example, 40% of users have experienced online harassment [3]. Antisocial behavior is present in the form of antisocial users, such ...

Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore

Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...

What Is Disinformation?

LIBRARY TRENDS, Vol. 63, No. 3, 2015 (“Exploring Philosophies of Information,” edited by Ken Herold), pp. 401–426. © 2015 The Board of Trustees, University of Illinois. Abstract: Prototypical instances of disinformation include deceptive advertising (in business and in politics), government propaganda, doctored photographs, forged documents, fake maps, internet frauds, fake websites, and manipula...

Comparison of the Social Impact of Review Articles with Original Research Articles Indexed in the Web of Science in Pharmacy, Biology, Psychology, and Agriculture fields

Background and Aim: The last two decades have witnessed efforts to identify ways and tools of showing the value of science for society, known as the social impact of science. These efforts have been made under various titles such as social benefits, social quality, social utility, social relevance, and so on. Academic publications, especially academic articles, are an objective representation of...

Conceptual Relationships and Diffusion Model of Information, Misinformation and Disinformation

Background and Aim: Proper management of the information process requires considering various definitions and combinations of the term "information". The purpose of this study was to clarify the concepts of information, misinformation and disinformation, and to better understand the ways of sharing, differentiation and relationships between them, and to explain the patterns and motivations for ...

Publication date: 2016